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Abstract 

Many complex systems are organized in the form of a network embedded in 
space. Important examples include the physical Internet infrastucture, road net- 
works, flight connections, brain functional networks and social networks. The effect 
of space on network topology has recently come under the spotlight because of the 
emergence of pervasive technologies based on geo-localization, which constantly fill 
databases with people's movements and thus reveal their trajectories and spatial 
behaviour. Extracting patterns and regularities from the resulting massive amount 
of human mobility data requires the development of appropriate tools for uncover- 
ing information in spatially-embedded networks. In contrast with most works that 
tend to apply standard network metrics to any type of network, we argue in this 
paper for a careful treatment of the constraints imposed by space on network topol- 
ogy. In particular, we focus on the problem of community detection and propose 
a modularity function adapted to spatial networks. We show that it is possible to 
factor out the effect of space in order to reveal more clearly hidden structural simi- 
larities between the nodes. Methods are tested on a large mobile phone network and 
computer-generated benchmarks where the effect of space has been incorporated. 



*Published as PNAS, 2011, 108, 7663-7668. arXiv: 1012.3409. Original title: "Beyond Space for 
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Understanding the principles driving the organization of complex networks is crucial 
for a broad range of fields including information and social sciences, economics, biology 
and neuroscience [1]. In networks where nodes occupy positions in an Euclidian space, 
spatial constraints may have a strong effect on their connectivity patterns [2] . Edges may 
either be spatially-embedded, such as in roads or railway lines in transportation networks 
or cables in a power grid, or abstract entities, such as friendship relations in online and 
offline social networks or functional connectivity in brain networks. In either case, space 
plays a crucial role by affecting, directly or indirectly, network connectivity and making 
its architecture radically different from that of random networks [3] . A crucial difference 
stems from the cost associated to long-distance links [H El El [71 El [9l [TOl [HI [12] which 
restricts the existence of hubs, i.e. high degree nodes, and thus the observation of fat-tailed 
degree distributions in spatial networks. 

From a modeling viewpoint, gravity models [131 El US] have long been used to model 
flows in spatial networks. These models focus on the intensity of interaction between 
locations i and j separated by a certain physical distance dij. It has been shown for 
systems as diverse as the International Trade Market [12], human migration [T7], traffic 
flows [TS] or mobile communication between cities [T^ [20] that the volume of interaction 
between distant locations is successfully modeled by 

T,, = N,N,f{d.,), (1) 

where Ni measures the importance of location i, e.g. its population, and the deterrence 
function / describes the influence of space. Equation ([ID emphasizes that the number 
of interactions between two locations is proportional to the number of possible contacts 
NiNj and that it varies with geographic distance, because of financial or temporal cost. 
In many socio-economic systems, / is well fitted by a power law ~ d^^" reminiscent of 
Newton's law of gravity, with population playing the role of a mass. 

Whereas a broad range of models have been specifically developed for spatial net- 
works [211 [22l [231 [2ll [25] , dedicated tools for uncovering useful information from their 
topology are poorly developed. When analyzing spatial networks, authors tend to use 
network metrics where the spatial arrangement of the nodes is ignored, thus disregarding 
that useful measures for non-spatial networks might yield irrelevant or trivial results for 
spatial ones. Important examples are the clustering coefficient, as spatial networks are 
often spatially clustered by nature, and degree distribution, where high degree nodes are 
suppressed by long distance costs. This observation underlines the need for appropri- 
ate metrics for the analysis and modeling of networks where spatial constraints play an 
important role [261 [2711^- 

This need is particularly apparent in the context of community detection. The detec- 
tion of communities (modules or clusters) is a difficult task which is important to many 
fields, and it has attracted much attention in the last few years [221 [201 [311 [321 [331 [3^ [35]. 
In a nutshell, modules are defined as sub-networks that are locally dense even though the 
network as a whole is sparse. Community detection is a central tool of network theory 
because revealing intermediate scales of network organization provides the means to draw 
readable maps of the network and to uncover hidden functional relations between nodes 
[32] • In the case of spatial networks, important practical applications include: i) the de- 
sign of efficient national, economical or administrative borders based on human mobility 
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or economical interactions instead of historical or ad- hoc reasons [36], [371 EHl [39] ; ii) the 
modeling of historical or pre-historical interactions based on limited archaeological evi- 
dence llQl [H] ; iii) the identification of functionally related brain regions and of principles 
leading to global integration and functional segregation [12| H3] . 

In practice, the current state-of-the-art for finding modules in spatial networks H5] 
is to optimize the standard Newman-Girvan modularity which, as we argue below, over- 
looks the spatial nature of the system. In most cases, this scheme produces communities 
which are strongly determined by geographical factors and provide poor information 
about the underlying forces shaping the network. For instance, social and transportation 
networks are typically dominated by low cost short-ranged interactions leading to mod- 
ules which are compact in physical space. As a result, modularity optimization is blind 
to spatial anomalies and fails to uncover modules determined by factors other than mere 
physical proximity. This point brings us to the central question of our work: In spatial 
networks, how can one detect patterns that are not due to space? In other words, are 
observed patterns only due to the effect of spatial distance, because of gravity-like forces, 
or do other forces come into play? If that is the case, can one go beyond a standard 
network methodology in order to uncover significant information from spatial networks? 

1 Social Networks and Space 

In order to illustrate these concepts and to clarify the goal of this paper, let us elaborate 
on social networks, where the dichotomy between network and space has been studied 
for decades. On the one hand, research has attempted to explain the organization of 
social networks purely in terms of the structural position of the nodes. Structural mech- 
anisms underpinning the existence of social interactions include triadic closure [16], link 
reciprocity [17] and reinforcement [H]. On the other hand, research has identified or- 
dering principles that explain edge creation in terms of non-structural attributes, mainly 
homophily [IH] and focus constraint ^U\. Homophily states that similarity, e.g. in terms 
of status or interests, fosters connection [19], as similar people tend to select each other, 
communicate more frequently and develop stronger social interactions [51]. The second 
ordering principle is focus constraint [50], which refers to the idea that social relations 
depend on opportunities for social contact. A dominant factor for focus constraint is 
geographic proximity, which offers opportunities for face-to-face interaction and encoun- 
ters between individuals [52| 153] . Focus constraint thus depends indirectly on distance 
through its dependence on transportation networks which themselves typically exhibit a 
gravity law. 

Although homophily and focus constraint are different mechanisms, they are often 
inter-related, because frequent contacts drive groups towards uniformity, through social 
influence, and that alike individuals tend to live in the same neighborhoods [51] • More- 
over, both aspects can be seen as originating from proximity in a high-dimensional social 
space, which summarizes people's interests and characteristics, i.e. nodes have a tendency 
to connect with neighbouring nodes in social space [55]. When uncovering modules of 
strongly connected nodes in complex networks, one deals with an extremely intricate 
situation where structural and non-structural effects, including homophily and focus con- 
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straint, are mingled. Modules uncovered by community detection are thus underpinned 
by an uncontrolled mixture of possibly antagonistic forces, from which few conclusions 
can be drawn [56]. Our aim is the following: when the spatial positions of the nodes are 
known, as more and more often is the case, is it possible to take out the effect of space in 
order to identify more clearly homophilious effects and thus hidden structural or cultural 
similarities. 

2 Modularity and Space 

Let us now introduce the notations and formalize the problem of community detection. In 
the following, we focus on weighted, undirected networks characterized by their adjacency 
matrix A. By definition, A is symmetric and Aij is the weight of the link between i and j. 
The strength of node i is defined as ki = Aij; m = Aij/2 is the total weight in the 
network. The distance between nodes i and j is denoted by dij. From now on, by distance, 
we mean Euclidian distance between nodes when measured on the embedding space, and 
not network distance, which is the number of edges traversed along the shortest path 
from one vertex to another. As discussed above, the nature of space and its associated 
distance may be abstract, i.e. affinity in a social network, or physical, i.e. geographical 
distance between cities. 

The fundamental idea behind most community detection methods is to partition the 
nodes of the network into modules. Contrary to standard graph partitioning algorithms, 
the detection of communities is performed without a priori specifying the number of 
modules nor their size, and aims at uncovering in an automated way the meso-scale 
organization of the network [31]. Behind most community detection methods, there is a 
mathematical definition measuring the quality of a partition. The widely-used modularity 
\57\ of a partition V measures if links are more abundant within communities than would 
be expected on the basis of chance, namely 

Q = (fraction of links within communities) 

— (expected fraction of such links) (2) 

In a mathematical expression, modularity reads 

(3) 

where i, j G C is a summation over pairs of nodes i and j belonging to the same commu- 
nity C of V and therefore counts links between nodes within the same community. 

What is meant by chance, i.e. the null hypothesis, is an extra ingredient in the def- 
inition [58] and is embodied by the matrix Pij. Pij is the expected weight of a link 
between nodes i and j over an ensemble of random networks with certain constraints. 
These constraints correspond to known information about the network organization, i.e., 
its total number of links and nodes, which has to be taken into account when assessing 
the relevance of an observed topological feature. In general, if Aij is symmetric, Pij is 
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also chosen to be symmetric and one also imposes that the total weight is conservecE], 
i.e. '^■j Aij = "^-j Pij = 2m. Beyond these basic considerations, different null models 
can be constructed depending on the network under consideration [59], [60], [61] . The most 
popular choice, proposed by Newman and Girvan (NG) [57] is 

i^^^ = kikj/2m, then Q = Qng- (4) 

where randomized networks preserve the strength of each node. Constraining the node 
strengths goes along the view that the network is well-mixed, in the sense that any 
node can be connected to any node and that only connectivity matters. In that case, 
node strength is a good proxy for the probability of a link to arrive on a certain node. 
Different types of heuristics can be developed in order to approximate the optimal value 
of the corresponding NG modularity [58] [62] [63] [64] • These methods have been shown to 
produce useful and relevant partitions in a broad class of systems [31], even if modularity 
suffers from limitations such as resolution limit [65] and a possible high degeneracy of its 
landscape [66] [67]. 

The NG null-model only uses the basic structural information encoded in the adja- 
cency matrix. Therefore, it is appropriate when no additional information on the nodes 
is available but not when additional constraints are known. In networks where distance 
strongly affects the probability for two nodes to be connected, a natural choice for the 
null model is inspired by the afore-mentioned gravity models 

P^^ = N,N,f{d.,) (5) 

where iVj is, as in ([1]), a notion of importance of node i and where the deterrence function 

Z^i,j\dij=d ^^i^^j 

is the weighted average of the probability Aij/{NiNj) for a link to exit at distance d. 
It is thus directly measured from the datc(§ and not fitted by a determined functional 
dependence, as is often the case [15]. By construction, the total weight of the network is 
conserved as required. Depending on the system under scrutiny, Ni may be the number 
of inhabitants in a city or the degree of a node when it corresponds to a single person 
in a social network. It is worth mentioning that in the latter case and if the embedding 
in space does not play a role, i.e. where f{d) is fiat, the standard NG model is exactly 
recovered (see SI). 

From now on, let us denote by Qspa the version of modularity whose null model 
Pfj^^ is given by ([5]). Qspa incorporates non-structural information about the nodes, i.e. 
their position in physical space. By definition, it favours communities made of nodes i 
and j such that A^j — Pf^^ is large, i.e. pairs of nodes which are more connected than 
expected for that distance. Compared to QnG) Qspa tends to give larger contributions to 
distant nodes and its optimization is expected to uncover modules driven by non-spatial 
factors. 



^This constraint can be relaxed in order to change the characteristic size of the network and thus to 
tune the resolution at which communities are uncovered [68j (see SI) 

^In practice, when analyzing empirical data, the distance between 2 cities is binned such as to 
smoothen f{d). The dependence of our results on bin size is explored in SI. 
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3 Numerical validation 



3.1 Belgian mobile phone data 

In order to compare the partitions obtained by optimizing Qng and Qspa, let us first 
focus on a Belgian mobile phone network made of its 571 communes (the 19 communes 
forming Brussels are merged into one) and of the symmetrized number of calls ^Li 
between them during a time period of 6 months (see [38] for a more detailed description 
of the data). This network is aggregated from the customer-customer communication 
network of a large mobile phone provider by using the billing address associated to each 
customer. The number of customers in each commune i is given by Ni. This network 
provides an ideal test for our method because of the importance of non-spatial factors 
driving mobile phone communication, namely the existence of two linguistic communities 
in Belgium|f|: a Flemish community and a French community mainly concentrated in the 
North and the South of the country respectively. As reported in [38], when the weights 
between communes are given by the average duration of communication between people 
in i and j, a standard NG modularity optimisation recovers a bi-partition that closely 
follows the linguistic border. 

Both versions of modularity are optimized using the spectral method described in 
[6l] . Visualization of the results are shown in Fig{TJ The NG modularity uncovers 18 
spatially compact modules, similar to those observed in other spatially extended net- 
works and mainly determined by short-range interactions between communes. Although 
this partition coincides with the linguistic separation of the country [3B], the unaware 
would not discover the existence of two linguistic communities only from Fig{T] The 
spatial modularity uncovers a strikingly different type of structure: an almost perfect 
bipartition of the country where the two largest communities account for about 75% of 
all communes (see SI for more details) and nicely reproduce the linguistic separation of 
the country. Moreover, Brussels is assigned to the French community, in agreement with 
the fact that ~ 80% of its population is French speaking, and despite the fact that it is 
spatially located in Flanders. The remaining smaller communities (not bigger than 10 
communes each) originate from the constraints imposed by a hard partitioning, which is 
blind to overlapping communities and might thus misclassify Flemish communes strongly 
interacting with Brussels and communes that have mixed language populations. A simi- 
lar bipartition is found by considering only the signs of the dominant eigenvector of the 
modularity matrix (see SI). 

3.2 Statistical tests 

The values for the optimal modularities can be found in Table [TJ It is important to stress 
that a direct comparison of Qng and Qspa is meaningless, as modularity is a way to 
compare different partitions of the same graph and so its absolute value is inconsequential. 
Moreover, the value of modularity is expected to be lower when its null model is closer to 
the real structure of the data, as it is the case for Qspa- In order to assess the significance 

■^There also exist a German-speaking community made of only 0.73% of the national population 
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Figure 1: Decomposition of a Belgian mobile phone network into communities (see main 
text). Each node represents a commune and its size is proportional to its number of clients 
Ni. (Top figure) Partition into 18 communities found by optimizing NG modularity. 
(Bottom figure) Partition into 31 communities found by optimizing Spa modularity. 
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Table 1: ^-scores for the modularity measured on the original data-set compared to the 
modularity values measured on the randomized data. 

of the uncovered partitions, one needs instead to resort to statistical tests by comparing 
modularity with that of an ensemble of random networks |62j . 

Two types of random networks are constructed: i) Networks where weights are ran- 
domized. Starting from the empirical /((i), we generated weights between two communes 
i and j according to a binomial of mean pNiNjf{dij). In the following, we chose p = 1, 
thus conserving (up to some fluctuations) the total number of calls in the system and 
the spatial dependence between nodes. Let us keep in mind that p allows to tune the 
importance of finite size fluctuations and that Aij/p = NiNjf{dij)) in the limit p — t- oo. 
ii) Networks where the geographical position of the nodes is randomized while leaving the 
weights unchanged. This second ensemble of random networks is radically different from 
the first one because it keeps the topology of the network unaffected and only randomizes 
node attributes. Since NG does not make use of geographical information, it is unaffected 
by this reshuffling. By construction, the effect on Spa is to make space less important by 
changing the function f{d), thus leading to an expression closer to NG (see SI). For each 
type of randomization we produce = 100 networks and optimize their modularities 
Qng and Qspa- 

The significance of the partitions found in the original data is first evaluated by 
comparing their modularity with that of the randomized data through a 2;-score [52], 
defined as 

Q-{Q) random /«\ 

^ ~ ) [' ) 

a 

where a is the standard deviation across 100 realizations. Results are summarized in Table 
[1] and clearly show that the original data is significantly more modular than networks 
where the weights are randomized. The ^-score is an order of magnitude larger for the 
spatial modularity. For the spatial randomization, in contrast, the 2;-score is negative, 
which reflects the fact that useful information is lost by randomizing node positions and 
that the resulting randomized null-model is further away from reality than the original. 

As a next step, we focus on the variability across the uncovered partitions. This is 
done by using normalized variation of information (VI) ^9], which is a measure of the 
distance between partitions. VI is equal to only when two partitions are identical and 
is between and 1 otherwise. Results are summarized in Table |5] where we observe 
that partitions obtained from NG and Spatial are genuinely different. In the case of 
weight randomization, the important point is that VI between partitions uncovered in 
random networks is much smaller for NG (0.09) than for Spa (0.58), thus indicating that 
very similar partitions are found by NG across random networks, i.e. only due to spatial 
interactions between communes. Another interesting point is the high similarity between 
partitions found by NG in the original data and by Spa in the spatially randomized 
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Table 2: Average VI measured between the partition found on the original data-set and 
the randomized ones (Orig-Rand) and the average VI among the randomized data-set 
(Rand-Rand) for both null-models and randomization procedures. 

networks, as their VI is found to be equal to 0.16, in agreement with the fact that Spa 
becomes similar to NG when space is irrelevant B This observation is confirmed by the 
similar values of VI between the partitions found by NG and Spa in the original data, as 
shown in FiglH i.e. 0.38, and between partitions found by Spa in the original data and 
in the spatially randomized data (0.35 in Table [2]). 

3.3 Gravity Model Benchmark 

In order to test the validity of our method in a controlled setting, let us now focus on 
computer-generated benchmarks for spatial, modular networks. The underlying idea is 
to build spatially-embedded random networks where the probability for two nodes to 
be connected depends on their distance, as observed in real-world examples, and on the 
community to which they are assigned. We implement benchmarks in the simplest way 
by throwing 100 nodes at random in a two dimensional square of dimension 100 x 100 and 
by randomly assigning them into two communities of 50 nodes. Contrary to the previous 
example, where nodes (communes) could have different sizes, we assume that all nodes 
have the same size. The probability that a link exists between nodes i and j has the form 

X{ci,Cj) 

where q is the community of node i. The function A(cj,Cj) determines the community 
linkage. By definition, it is equal to 1 if q = Cj and Adifrerent otherwise. When AdiSerent = 0, 
only nodes in the same community are connected, while no distinct communities are 
present when Adigerent = 1. A normalization constant, Z, ensures that Ylii>jPij — ^■ 
These networks, directly inspired by gravity models, are built by placing L = pN{N—l)/2 
links with probability pij, where p > determines the density of links in the network. 
Multiple links are allowed and interpreted as weights. The parameter p controls finite-size 
fluctuations around the expected number of edges Lpij. 

In order to compare the efficiency of Qspa and Qngj we generated one realization 
of the random model for different values of Adifrerent £ [0,1] and p G [0.01,100], and 
optimized their modularity. As a measure of the quality of the uncovered partitions, 
we compared them with the known bipartition of the network by using normalized VI. 
Our simulations show that Qspa outperforms Qng and that the improvement becomes 

is important to stress that the spatial randomization does not entirely remove the effect of space 
on network connectivity because self-loops, i.e. intra-commune links, are preserved. 



9 



100 ^ 

different 




Figure 2: Variation of information over the (Adifferent, p) parameter space for Spa (top 
figure) and NG (bottom figure) wlien tested on tfie spatial benchmark. Spa is able to 
recover the correct communities over a wide range of parameters' values, while NG fails 
to find the correct communites almost as soon as the interaction Adifferent is turned on. 

larger and larger as the density of links is increased (see FigI2]). In the limit p — oo, 
where fluctuations become negligible, our simulations show that Spa perfectly identifies 
the correct communities for any Adifferent < 1 while NG fails even for small values of 
Adifferent- It is also interesting to note that results presented in Figj2]are obtained for single 
realizations of the random networks, i.e. as when dealing with empirical data sets one does 
not analyze an ensemble of networks, and yet the precision of Spa is significantly better 
than that of NG (results smoothed by averaging over several realizations are presented 
in SI). 

4 Discussion 

Despite the increasing availability of affordable long distance travel and new communi- 
cation media, the "death of distance" [70] has been greatly exaggerated [HI [71]. Fur- 
thermore, the emergence of new technologies entangling physical and virtual worlds has 
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stimulated new research and produced new applications for social and human mobility 
networks embedded in space [72]. This importance of space is not limited to social net- 
works as a broad range of economical and biological are also spatially embedded, with 
strong consequences on their topological organization. The main purpose of this paper 
has been to find new ways to uncover significant patterns in spatial networks. To do 
so, we have taken advantage of the flexibility of a quantity called modularity defined 
for community detection. Modularity incorporates a null model which represents what 
is expected by chance, namely the expected probability that two nodes are connected. 
Unlike the standard null model, we incorporate non-structural attributes into our null 
model and use this as a comparison with empirical data. By doing so, we construct null 
models which portray more closely the network under scrutiny and provide the means 
to exploit known attributes, e.g. spatial location, in order to uncover unknown ones, e.g. 
homophilious relations. 

We believe that our general framework is suitable for a wide range of networks and 
that it opens avenues of quantitative exploration of spatially distributed systems. Inter- 
esting lines of research include the development of more general null models, for instance 
by interlacing structural and non-structural information, and the detection of hierarchies 
in spatial networks either by tuning the resolution of modularity [68] or by looking for 
local maxima of the modularity landscape [66] (see SI for more details). Moreover, our 
methodology is not limited to situations where distance is measured in physical space as 
it may be applied whenever one can use node attributes to define a separation between 
nodes. For instance in many social networks age may be a dominant factor, yet by build- 
ing a null model on the age difference between actors, other types of relationships may 
be revealed for little extra computational effort. A further advantage is that by incorpo- 
rating relevant information, a partitioning approach can be applied even if modules are 
pervasively overlapping [721 El]. 
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Supplementary Information 

This supplement to the paper "Uncovering space-independent communities in spatial 
networks" contains detailed information on relations between the spatial null model and 
the standard NG null model, multi-scale modularity, data-sets, effect of bin size on results, 
optimal bi-partitions of the mobile phone networks, randomizations used to assess the 
significance of the results. 

A Spatial vs Newman-Girvan null model 

As discussed in the main text, the null model Pij is a crucial ingredient of modularity 



defined as 




(9) 
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The most standard choice is 



P^G ^ ^^^./2m (10) 

where the probabihty for 2 nodes to be connected is proportional to their degree. Our 
spatial null model incorporates non-structural ingredients, namely a dependence on the 
physical distance dij between 2 nodes 

P^f^ = N,N,f{d.,) (11) 

where Ni is a measure of the importance of node i and where the deterrence function 

V , TV- TV- 

is measured from the empirical data. This expression directly comes from the constraint 

i,j\dij=d i,j\dij=d 

that the total weights between nodes at a certain distance is preserved. When analysing 
the mobile phone network, we have taken iVj as the number of clients in commune i, in 
analogy with simple versions of gravity models. In that case, the above expression for 
f{d) is a weighted average of the probability Aij/{NiNj) to have a call between clients in 
i and in j. 

An interesting choice for TVj is to chose the degree itself, i.e. TVj = ki such that the set 
of null models 

Pt = k^k,f{d,,) (14) 

includes Pfj'^- Indeed, if f{d) does not depend on d, one finds f{d) = l/2m and P^j^^ 
reduces to P^^. 

Finally, we would like to briefly introduce a possible further generalization of the 
modularity function. Even when dealing with systems with strong spatial constraints, 
one might want to favor the importance of topological effects over spatial ones. A way of 
weighing both aspects in the modularity function is to introduce a mixing parameter, ^, in 
order to interpolate between the two previous null models, Pij{^) = {C,P^j^'^ + {l — ^)P^G). 



B Multi-scale Modularity 

Modularity optimization suffers from the limitation of producing one single partition, 
which is not satisfactory when dealing with multi-scale or hierarchical systems, that is 
systems made of (typically nested) modules at different scales. Different methods have 
been proposed to overcome this limitation [75]. A first naive approach consists in re- 
applying modularity optimization on the communities found in the whole system. This 
approach provides a first guess but has the drawbacks of neglecting the global organization 
of the system when uncovering finer modules and of being unable to uncover coarser 
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partitions than those obtained by the original modularity optimization. A second set of 
methods looks for local maxima of the modularity landscape, which has been shown to 
produce to modules at different scales [66] . Finally, a third class of methods is based on 
multi-scale quality functions where a resolution parameter is incorporated such as to tune 
the characteristic size of the modules. A popular quantity is the parametric modularity 
introduced by Reichardt and Bornholdt [68] 




where 7 plays the role of a resolution parameter. Increasing 7 tends to decrease the 
characteristic size of the modules in the optimal partition. 

Because Qspa only differs from standard modularity by the choice of its null model, 
all three approaches can directly be applied to search for multi-scale modules in spatial 
networks. While an exhaustive analysis of such hierarchical organization is beyond the 
scope of the current work, we have performed a preliminary re-decomposition of the two 
largest communities found in the mobile phone network (see FiglS]). On finds Q = 0.019 
and z score= 91 in the case of the Northern community, and Q = 0.064 and z score= 425 
in the case of the Southern community (the z score is calculated for the ensemble of 
random networks where weights are randomized). Values of the z score are high but 
smaller than in the whole system {z score= 803). Moreover, the higher value observed 
in the Southern community than in the Northern community is expected due to the 
presence of bilingual Brussels in the former community. These results confirm that the 
linguistic division is the dominant factor, but also suggest that more regional factors (e.g. 
importance of local dialects, for instance in the Flemish community, cultural differences 
between cosmopolitan Brussels and the more rural Walloon, etc.) still play a role and 
lead to observable, finer sub-divisions of the country. 



Figure 3: Decomposition of the two main communities found in the Belgian mobile 
phone network into sub-communities (see main text). Each node represents a commune 
and its size is proportional to its number of clients A^,. (Left figure) Northern community. 
(Right figure) Southern community. 





(15) 
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C Weights in the Belgian mobile phone data 



In our analysis of the mobile phone data, we have considered the fully connected matrix 
{Aij}fY=i where Aij is the total number of calls between users in commune i and in 
commune j. Different types of weights could have been chosen for this aggregated network 
where node correspond to communes instead of individual users. In [38], the authors focus 
on another network where weights Aij/ (NiNj) correspond to the probability that users in 
i and j have called each other. This sensible choice gives, on average, the same importance 
to each commune and thus removes the effect of heterogeneity coming from different sizes 
of communes. In this article, we have instead preferred the first option, mainly for 2 
reasons: 

- i) One of the aims of modularity is to properly account for the importance of nodes 
in the null model, thereby producing balanced modules in terms of this measure of im- 
portance [7B] . Because the definition of a proper null model is at the heart of this paper, 
we have preferred to preserve a strong heterogeneity (Fig. H]) in the system and to let the 
definition of modularity "deal with it" . 

-ii) By focusing on a meta-network where the weights of the links between communes 
is the sum over the links between their users, the same importance is given to each user. 
More importantly, modularity at the commune-level is related to modularity at the user- 
level: by optimizing the modularity of A^j, one finds the best partition of the user network 
with the constraint that users in the same commune must be in the same community [77] . 




Figure 4: Zipf plot of the commune sizes. The system is highly heterogeneous with several 
orders of magnitude between largest and smallest communes. 



D Size of the communities 

In the main text, we point out that the two largest communities found using the spatial 
null-model account for about 75% of all communes. The remaining communes are as- 
signed to 29 small communities, most of them close to Brussel. This can be attributed to 
the blindness of the algorithm we used to overlapping communities and the strong inter- 
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Figure 5: Sizes of the communities found by Spa (full blue line) and NG (dashed red line). 
The size of each community is measured by the number of communes it contains. In the 
partition found by Spa, two communities are large while the others are of negligible size. 
In the partition found by NG, all communities are of similar size. 



,x 10 




10 15 20 
Community 



Figure 6: Sizes of the communities found by Spa (full blue line) and NG (dashed red 
line). The size of each community is measured by the number of customers living in it. 
The labelling of the communes is the same as in the previous figure. The partition found 
by Spa also gives two large communities while the others are of negligible size. In the 
partition found by NG, all communities are again of similar size. 



action of Flemish speaking communes with Brussel. To clearly illustrate this point, we 
plot in Fig. |5]the size of the communities found by the two null-models. This plot clearly 
shows that the communities found by Spa other than the two largest are of negligible size. 
The sizes of the communities found by NG on the other hand are rather homogeneous. 
Fig. [H] shows the size of the communities in term of number of customer. Again the two 
largest communities found by Spa account for more than 70% of the customers, while 
NG again divides the Belgian population into communities fairly similar in size. 
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Figure 7: Deterrence function f{d) for different size of bins. Solid lines: green: 1 [km], 
red: 2 [km], blue: 5 [km]. Dashed lines: green: 10 [km], red: 20 [km], blue: 50 [km]. 
Dashed and dotted line: green: 100 [km], red: 200 [km]. 
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Figure 8: Averaged normalised variation of information as a function of bin size. The 
minimum is reached at 5 [km]. 

E Binning distance 

The evaluation of f{d) depends on the size of the bins used to measure distances. Two 
extreme cases are 1 [km] and 200 [km] (the largest distances in Belgium are of the order 
of 300 [km] and we need at least two bins). To choose the most appropriate size for the 
bins in that range, we computed the deterrence function f{d) and the partitions obtained 
for 8 different bin sizes s: 1, 2, 5, 10, 20, 50, 100 and 200 [km]. The different deterrence 
functions are shown in Fig. [71 There is no clear discrimination for distances smaller than 
5 [km] and the noise in the tail of the distribution is negligible from 20 [km]. Considering 
the size of Belgium, the number of communes (571) and the typical distance between 
them a bin size of 5 [km] is a reasonable choice |19] . 

In order to support this choice, we have also computed the average normalised vari- 
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Distance [5km] 

Figure 9: Solid (blue) line: f{d) for the Belgian mobile phone network, dashed (red) line: 
f{d) after the positions' randomisation. 

ation of information {V{s)) = ■^J^s'j^s'^i^^^') between the partition at bin size s and 
those at other bin sizes (see Fig. |8]). The size that is closest to all the others, thus the 
most representative of the system, is 5 [km]. 

In Figure IH we plot f{d) when measure with bin sizes of 5 [km] in the original data 
and in a the same network where positions are randomized. 

F Gravity Model Benchmark: Averages 

In the main text, we tested how close uncovered partitions were from the known underly- 
ing community structure as a function of the interaction strength and the density of links. 
The results for one single realization of the benchmarks were overwhelmingly in favour 
of Spa. Here we produce the same graphs, but averaged over 100 different realisations 
of the random networks, thus leading to a smoother surface. Fig. [TOl We also present a 
"phase diagram" (Adiffcrent, p) in which we highlight values where the partitioning ceases 
to be perfect (i.e. the normalised variation of information becomes larger than 0), Fig. 
[TTl One observes that Spa offers a perfect reconstruction over a significantly broader 
range of parameters than NG. N.B. Visualizations of the benchmarks are shown in Fig. 

m 

G Bipartition of the mobile phone data 

From modularity, it is always possible to partition a network into two communities by 
assigning each node to a community according to its sign in the leading eigenvector of 
the modularity matrix. In this procedure, a negative second largest eigenvalue indicates 
that this bipartition is a good approximation to the full optimisation of modularity [78] . 
When applied to the Belgian mobile phone network, the second eigenvalue for NG mod- 
ularity matrix is positive, contrary to Spa, thereby suggesting that a bipartition is a 
reasonable approximation to the full optimization for Spa. This is confirmed visually in 
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Figure 10: Averaged normalised variation of information over the (AdifTcrcnt; p) parameter 
space for Spa (upper figure) and NG (lower figure) when tested on the spatial benchmark. 
Spa is able to recover the correct communities over a wide range of parameters' values, 
while GN fails to find the correct communities almost as soon as the interaction Adiffercnt 
is turned on. 
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Figure 11: Adifferent—p parameter space separation. Solid red line: NG, solid blue line: Spa. 
Below the line, the partition found matches the real one, above there are discrepancies. 
Spa is able to recover the original communities on a much larger range of parameters 

Fig. [131 where NG picks Brussels and its neighborhood, and the rest of Belgium as the 
best bipartition, while Spa gives a North-South bipartition consistent with the linguistic 
bipartition of the country. 
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Figure 12: Illustration of the probabilities used for a Gravity Model Benchmark made of 
40 nodes with Adiffcrent = 0.5. Only 20% of links with highest probability are represented. 
Hidden communities are represented by squares and circles. Different types of interactions 
are highlighted by their linestyle: red-red: dotted, blue-blue: dashed, red-blue: solid. 




Figure 13: Decomposition of a Belgian mobile phone network into communities (see main 
text). Each node represents a commune and its size is proportional to its number of clients 
Ni. (Left figure) Partition into 18 communities found by optimizing GN modularity. 
(Right figure) Partition into 31 communities found by optimizing Spa modularity. 
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